18 research outputs found

    B²N²: Resource efficient Bayesian neural network accelerator using Bernoulli sampler on FPGA

    A resource-efficient hardware accelerator for Bayesian neural networks (BNNs) named B²N², a Bernoulli random number based Bayesian neural network accelerator, is proposed. As neural networks (NNs) expand into risk-sensitive domains where mispredictions may cause serious social and economic losses, evaluating an NN's confidence in its predictions has emerged as a critical concern. Among the many uncertainty evaluation methods, a BNN provides a theoretically grounded way to evaluate the uncertainty of an NN's output by treating the network parameters as random variables. By exploiting the central limit theorem, we propose to replace costly Gaussian random number generators (RNGs) with Bernoulli RNGs, which can be implemented efficiently in hardware because the outcome of a Bernoulli distribution is binary. We demonstrate that B²N² implemented on a Xilinx ZCU104 FPGA board consumes only 465 DSPs and 81661 LUTs, which correspond to 50.9% and 14.3% reductions compared to Gaussian-BNN (Hirayama et al., 2020) implemented on the same FPGA board for fair comparison. We further compare B²N² with VIBNN (Cai et al., 2018), which shows that B²N² reduces DSP and LUT usage by 50.9% and 57.9%, respectively. Owing to the reduced hardware resources, B²N² improves energy efficiency by 7.50% and 57.5% compared to Gaussian-BNN (Hirayama et al., 2020) and VIBNN (Cai et al., 2018), respectively.
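The central limit theorem idea in the abstract above can be illustrated in software. The following is a minimal sketch, not the B²N² architecture: the function names, the choice of 64 fair bits per sample, and the use of a population count are all assumptions made for illustration. It sums fair Bernoulli bits, standardizes the sum, and scales it to approximate a draw from N(mu, sigma²).

```python
import math
import random


def bernoulli_gaussian(mu, sigma, n_bits, rng):
    """Approximate one draw from N(mu, sigma^2) using n_bits fair Bernoulli bits.

    Hypothetical helper for illustration; not taken from the paper.
    """
    # Population count of n_bits random bits = sum of n_bits Bernoulli(0.5) draws.
    ones = bin(rng.getrandbits(n_bits)).count("1")
    # The sum has mean n_bits/2 and variance n_bits/4; standardize it.
    z = (ones - n_bits / 2) / math.sqrt(n_bits / 4)
    # By the central limit theorem, z is approximately standard normal.
    return mu + sigma * z


def sample_stats(mu, sigma, n_bits=64, n_samples=20000, seed=0):
    """Return the empirical mean and standard deviation of the approximate sampler."""
    rng = random.Random(seed)
    xs = [bernoulli_gaussian(mu, sigma, n_bits, rng) for _ in range(n_samples)]
    mean = sum(xs) / len(xs)
    var = sum((x - mean) ** 2 for x in xs) / (len(xs) - 1)
    return mean, math.sqrt(var)


if __name__ == "__main__":
    mean, std = sample_stats(mu=1.0, sigma=0.5)
    print(f"empirical mean={mean:.3f}, std={std:.3f}")
```

The approximation is discrete (only n_bits + 1 distinct values), so a larger n_bits trades more random bits for a smoother distribution.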

    Efficient Aging-aware Failure Probability Estimation Using Augmented Reliability and Subset Simulation

    A circuit-aging simulation that efficiently calculates the temporal change of a rare circuit-failure probability is proposed. Whereas conventional methods require a long computational time because the failure probability must be calculated separately at each device age, the proposed Monte Carlo based method requires only a single set of simulations. By applying the augmented reliability and subset simulation framework, the change of failure probability over the lifetime of the device can be evaluated through analysis of the Monte Carlo samples. Combined with the two-step sample generation technique, the proposed method reduces the computational time to about 1/6 of that of the conventional method while maintaining sufficient estimation accuracy.
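For readers unfamiliar with subset simulation, the sketch below shows the basic technique on a toy problem. It is not the method of the paper: the augmented reliability formulation and the two-step sample generation are omitted, a plain Metropolis step with a uniform proposal is used, and the function name, parameters, and the toy performance function are assumptions. The rare probability is estimated as a product of larger conditional probabilities, each close to p0.

```python
import math
import random


def subset_simulation(g, dim, threshold, n=1000, p0=0.1, seed=0, max_levels=50):
    """Estimate P(g(X) >= threshold) for X ~ N(0, I) by basic subset simulation."""
    rng = random.Random(seed)
    n_seed = int(p0 * n)   # samples kept as chain seeds at each level
    steps = n // n_seed    # states generated from each seed
    samples = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n)]
    prob = 1.0
    for _ in range(max_levels):
        samples.sort(key=g, reverse=True)
        level = g(samples[n_seed - 1])  # intermediate threshold for this level
        if level >= threshold:
            # Final level: count actual failures among the current samples.
            n_fail = sum(1 for s in samples if g(s) >= threshold)
            return prob * n_fail / n
        prob *= p0
        next_samples = []
        for x in samples[:n_seed]:
            for _ in range(steps):
                cand = [xi + rng.uniform(-1.0, 1.0) for xi in x]
                # Metropolis ratio of standard normal densities (symmetric proposal).
                log_ratio = 0.5 * (sum(v * v for v in x) - sum(v * v for v in cand))
                accept = rng.random() < math.exp(min(0.0, log_ratio))
                # Candidates outside the current intermediate region are rejected.
                if accept and g(cand) >= level:
                    x = cand
                next_samples.append(x)
        samples = next_samples
    raise RuntimeError("threshold not reached within max_levels")


if __name__ == "__main__":
    # Toy performance function: g(X) is standard normal, so P(g >= 4) is about 3.2e-5.
    est = subset_simulation(lambda x: (x[0] + x[1]) / math.sqrt(2.0), 2, 4.0)
    print(f"estimated failure probability: {est:.2e}")
```

With 1000 samples per level the estimate is noisy; it should land in the right order of magnitude, and more samples per level tighten it.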

    Modular DFR: Digital Delayed Feedback Reservoir Model for Enhancing Design Flexibility

    A delayed feedback reservoir (DFR) is a type of reservoir computing system well-suited for hardware implementations owing to its simple structure. Most existing DFR implementations use analog circuits that require both digital-to-analog and analog-to-digital converters for interfacing. However, digital DFRs emulate analog nonlinear components in the digital domain, resulting in a lack of design flexibility and higher power consumption. In this paper, we propose a novel modular DFR model that is suitable for fully digital implementations. The proposed model reduces the number of hyperparameters and allows flexibility in the selection of the nonlinear function, which improves the accuracy while reducing the power consumption. We further present two DFR realizations with different nonlinear functions, achieving 10x power reduction and 5.3x throughput improvement while maintaining equal or better accuracy.
    Comment: 20 pages, 11 figures. Accepted for publication in the International Conference on Compilers, Architectures, and Synthesis for Embedded Systems (CASES) 2023. Will appear in ACM Transactions on Embedded Computing Systems (TECS).

    Visualizing Phonotactic Behavior of Female Frogs in Darkness

    Many animals use sounds produced by conspecifics for mate identification. Female insects and anuran amphibians, for instance, use acoustic cues to localize, orient toward, and approach conspecific males prior to mating. Here we present a novel technique that utilizes multiple, distributed sound-indication devices and a miniature LED backpack to visualize and record the nocturnal phonotactic approach of females of the Australian orange-eyed tree frog (Litoria chloris), both in a laboratory arena and in the animal's natural habitat. Continuous high-definition digital recording of the LED coordinates provides automatic tracking of the female's position, and the illumination patterns of the sound-indication devices allow us to discriminate multiple sound sources, including loudspeakers broadcasting calls as well as calls emitted by individual male frogs. This innovative methodology is widely applicable to the study of phonotaxis and of the spatial structures of acoustically communicating nocturnal animals.

    A Study on the Variability of BTI Degradation in Transistors: From Characterization to SRAM Circuit Yield Prediction

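    Kyoto University doctoral thesis (course doctorate, new system), Doctor of Informatics. Degree No. Kō 19862 (Jōhaku No. 613); call number 新制||情||106 (Main Library); 32898. Department of Communications and Computer Engineering, Graduate School of Informatics, Kyoto University. Examination committee: Prof. Takashi Sato (chief examiner), Prof. Hidetoshi Onodera, Prof. Naofumi Takagi. Conferred under Article 4, Paragraph 1 of the Degree Regulations.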

    Pay Attention via Quantization: Enhancing Explainability of Neural Networks via Quantized Activation

    Modern deep learning algorithms comprise highly complex artificial neural networks, making it extremely difficult for humans to track their inference processes. As the social implementation of deep learning progresses, the human and economic losses caused by inference errors are becoming increasingly problematic, making it necessary to develop methods to explain the basis for the decisions of deep learning algorithms. Although an attention mechanism-based method to visualize the regions that contribute to steering angle prediction in an automated driving task has been proposed, its explanatory capability is low. In this paper, we focus on the fact that the importance of each bit in the activation value of a network is biased (i.e., the sign and exponent bits are weighted more heavily than the mantissa bits), which has been overlooked in previous studies. Specifically, this paper quantizes network activations, encouraging important information to be aggregated to the sign bit. Further, we introduce an attention mechanism restricted to the sign bit to improve the explanatory power. Our numerical experiment using the Udacity dataset revealed that the proposed method achieves a 1.14× higher area under curve (AUC) in terms of the deletion metric.

    Efficient Aging-Aware Failure Probability Estimation Using Augmented Reliability and Subset Simulation


    Efficient aging-aware SRAM failure probability calculation via particle filter-based importance sampling

    An efficient Monte Carlo (MC) method for calculating the degradation of the failure probability of an SRAM cell due to negative bias temperature instability (NBTI) is proposed. In the proposed method, a particle filter is utilized to incrementally track temporal performance changes in an SRAM cell. The number of simulations required to obtain a stable particle distribution is greatly reduced by reusing the final distribution of the particles in the last time step as the initial distribution. Combined with a binary classifier, which quickly judges whether an MC sample causes a malfunction of the cell, the total number of simulations needed to capture the temporal change of failure probability is significantly reduced. The proposed method achieves a 13.4× speed-up over the state-of-the-art method.
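As background for the importance sampling mentioned above, here is a minimal sketch of plain mean-shift importance sampling on a one-dimensional toy problem. It does not include the particle filter, the binary classifier, or any SRAM or NBTI model from the paper; the function names and the Gaussian toy setting are assumptions. Samples are drawn from a proposal centered on the failure boundary and reweighted by the density ratio, so that rare failures are observed often.

```python
import math
import random


def failure_prob_is(threshold, n=20000, seed=0):
    """Estimate P(X >= threshold) for X ~ N(0, 1) with a proposal N(threshold, 1)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        x = rng.gauss(threshold, 1.0)  # draw from the shifted proposal
        if x >= threshold:
            # Importance weight phi(x) / phi(x - threshold)
            #   = exp(-threshold * x + threshold^2 / 2).
            total += math.exp(-threshold * x + 0.5 * threshold * threshold)
    return total / n


def exact_tail(threshold):
    """Exact standard normal tail probability, used only as a reference."""
    return 0.5 * math.erfc(threshold / math.sqrt(2.0))


if __name__ == "__main__":
    est = failure_prob_is(4.0)
    print(f"importance sampling: {est:.3e}, exact: {exact_tail(4.0):.3e}")
```

A naive Monte Carlo run of the same size would see on average fewer than one failure at this threshold, which is why a shifted proposal helps.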

    Low Latency 256-bit $\mathbb{F}_p$ ECDSA Signature Generation Crypto Processor
